[SPARK-17100] [SQL] fix Python udf in filter on top of outer join #15103
davies wants to merge 2 commits into apache:master from
Test build #65402 has finished for PR 15103 at commit
Test build #65502 has finished for PR 15103 at commit
  val emptyRow = new GenericInternalRow(attributes.length)
- val v = BindReferences.bindReference(e, attributes).eval(emptyRow)
+ val boundE = BindReferences.bindReference(e, attributes)
+ if (boundE.find(_.isInstanceOf[Unevaluable]).isDefined) return false
It looks like there's a similar hasUnevaluableExpression check in the ConvertToLocalRelation optimizer rule, except that rule also carves out a special-case exception for AttributeReference (see #13418 (comment) for discussion).
I guess we don't want to do that exemption here, though, since we're actually having to evaluate the rule and the attribute reference won't be replaced before evaluation.
Also I guess that the AttributeReferences should have been replaced following binding in the previous line, so this looks good to me.
LGTM
Merging into master and 2.0.
## What changes were proposed in this pull request?

In the optimizer, we try to evaluate the condition to see whether it's nullable or not, but some expressions are not evaluable, so we should check for that before evaluating it.

## How was this patch tested?

Added regression tests.

Author: Davies Liu <davies@databricks.com>

Closes #15103 from davies/udf_join.

(cherry picked from commit d810415)
Signed-off-by: Davies Liu <davies.liu@gmail.com>
What changes were proposed in this pull request?
In the optimizer, we try to evaluate the condition to see whether it's nullable or not, but some expressions are not evaluable, so we should check for that before evaluating it.
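To illustrate the guard, here is a minimal toy sketch that does not depend on Spark. The types `Expr`, `BoundRef`, `IsNotNull`, and `PythonUdf`, and the helper `canFilterOutNull`, are hypothetical stand-ins that only borrow names from the diff; they are not Catalyst's real classes. The final `v == null || v == false` test is also an assumption about how the evaluated result is interpreted.

```scala
// Toy model only: these are NOT Spark's Catalyst classes.
sealed trait Expr {
  def children: Seq[Expr]
  def eval(row: Array[Any]): Any
  // Depth-first search for the first node that satisfies the predicate.
  def find(p: Expr => Boolean): Option[Expr] =
    if (p(this)) Some(this)
    else children.iterator.map(_.find(p)).collectFirst { case Some(e) => e }
}

// Marker for nodes that cannot be interpreted directly; eval always throws.
trait Unevaluable extends Expr {
  def eval(row: Array[Any]): Any =
    throw new UnsupportedOperationException(s"Cannot evaluate: $this")
}

// A reference that has already been bound to a row position.
case class BoundRef(ordinal: Int) extends Expr {
  def children: Seq[Expr] = Nil
  def eval(row: Array[Any]): Any = row(ordinal)
}

case class IsNotNull(child: Expr) extends Expr {
  def children: Seq[Expr] = Seq(child)
  def eval(row: Array[Any]): Any = child.eval(row) != null
}

// Hypothetical placeholder for a Python UDF call.
case class PythonUdf(name: String, children: Seq[Expr]) extends Unevaluable

object OuterJoinNullCheck {
  // Returns true when the condition rejects a row whose columns are all null.
  def canFilterOutNull(cond: Expr, width: Int): Boolean =
    if (cond.find(_.isInstanceOf[Unevaluable]).isDefined) {
      false // Guard: never call eval on a tree containing an Unevaluable node.
    } else {
      val emptyRow = new Array[Any](width) // every slot starts out null
      val v = cond.eval(emptyRow)
      v == null || v == false // assumed interpretation of the result
    }

  def main(args: Array[String]): Unit = {
    println(canFilterOutNull(IsNotNull(BoundRef(0)), 1))
    println(canFilterOutNull(IsNotNull(PythonUdf("f", Seq(BoundRef(0)))), 1))
  }
}
```

Without the guard, the second call would reach `PythonUdf.eval` and throw; with it, the check conservatively returns `false`.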
How was this patch tested?
Added regression tests.